January 25, 2013 15:49 WSPC/INSTRUCTION FILE 




EIGENVECTOR LOCALIZATION AS A TOOL TO STUDY SMALL 
COMMUNITIES IN ONLINE SOCIAL NETWORKS 



FRANTISEK SLANINA 

Institute of Physics, 
Academy of Sciences of the Czech Republic, 
Na Slovance 2, CZ-18221 Praha, Czech Republic 

slanina@fzu.cz

ZDENEK KONOPASEK 

Center for theoretical study, 
Charles University in Prague / Academy of Sciences of the Czech Republic, 
Jilská 1, Praha, Czech Republic

zdenek@konopasek.net



We present and discuss a mathematical procedure for identification of small "commu- 
nities" or segments within large bipartite networks. The procedure is based on spectral 
analysis of the matrix encoding network structure. The principal tool here is localiza- 
tion of eigenvectors of the matrix, by means of which the relevant network segments 
become visible. We exemplified our approach by analyzing the data related to product 
reviewing on Amazon.com. We found several segments, a kind of hybrid communities of
densely interlinked reviewers and products, which we were able to meaningfully interpret 
in terms of the type and thematic categorization of reviewed items. The method provides 
a complementary approach to other ways of community detection, typically aiming at 
identification of large network modules. 

1. Introduction

The complexity of our societies is studied by social analysts in various ways. Qualita- 
tive inquiries and case studies usually put little emphasis on formalized description, 
partly to avoid oversimplification, or even trivialization of the phenomena under 
study. On the other side, sophisticated mathematical procedures are increasingly 
used in order to grasp complexity in a specific way, as a formalized property of 
larger systems. One of the branches of the latter stream is represented by the anal- 
ysis of social networks using mathematical theory of graphs. Our approach adheres 
precisely to this field of research, and yet it follows a slightly different direction than
most efforts in contemporary network analysis. 

The purpose of this paper is twofold. First, we want to present a specific solution 
to a rather standard problem of social network analysis, which is identification of 
communities within complex networks. Second, we want to discuss some alternative 
perspectives on the concept of "social network". We suggest that our method might
provide a suitable tool for empirical research in respective directions, enabling the 
analyst to determine those "hot spots" within the network that usually escape 
attention. 

To make the wider methodological context of our paper clearer, let us start with 
some notes on networks and network analysis in contemporary sociology. The use 
of the term "network" in contemporary sociology varies from loose metaphors [17] to
rather specific and technical meanings [23, 83], compatible with network science
as understood in mathematics or physics. 

Social network analysis has a complex history, with roots in the sociometric
analysis of Moreno at the beginning of the 20th century, the Harvard researchers
of the 1930s and 1940s who studied interpersonal configurations and cliques, and,
finally, the group of anthropologists based in Manchester who, at roughly the same
time, focused on conflict and change instead of emphasizing integration and
cohesion as their predecessors had; see [83], pp. 7-37. In the 1960s, a key turn
to mathematization
occurred, which gave this field a new impulse and high ambitions. Today, encour- 
aged by the rise of interest in networks in other scientific disciplines, social network 
analysis is sometimes seen as an approach that may entirely redefine the social 
sciences, while integrating them into a broader interdisciplinary research program 
[H]. Formalized analytical procedures have contributed hugely to the fact that social
network analysis has become a firm basis for social science discussions [HO]. However,
the integration of mathematical analytic thinking with sociological imagination is
an intricate task. As noted by [31], the application of formalized methods of social
network analysis is often marked by neglecting substantive and theoretical sociolog- 
ical consequences. Also, despite the growing popularity of mathematical modeling, 
qualitative, or ethnographic studies of "network sociality" [92] keep their relevance, 
hand in hand with quantitative approaches. 

Given this complicated background, our aim, in this paper, is rather modest. We 
want to introduce and illustrate a new mathematical method for identification of 
small parts of complex networks with a higher level of commonality and for studying
their basic formal properties. As an example and possible field of application we 
have chosen networks of product reviewing on the Amazon.com portal. Here, the 
simplest possible ties structuring the network are the connections established by 
two reviews written on the same product. In other words, what reviewers may 
have in common is the product reviewed by them. Configurations in which one
product, e.g., a book or a CD, is reviewed by two or more reviewers are frequent,
of course, and nothing special. But if the same reviewers are similarly connected
via some other items too, the situation becomes more interesting. We can assume that
network segments with a higher density of such links represent small communities of
reviewers with similar interests. Our first and main objective is to find these small 
communities. 

Identification of such small-size groupings has always been one of the key tasks
in social network analysis. Sometimes the identification of these network segments
is an interesting empirical finding in itself. At other times, however, the need to
focus on smaller network segments is methodological rather than substantive or
theoretical: for instance, when David and Pinch [22] analyzed the phenomenon of
review plagiarism on Amazon.com, they had to "localize" the phenomenon in order to
make it easier to grasp in detail. Thus, they had to reduce their sample, focusing on
those segments of the vast amount of available data in which reviewed products were
"somewhat similar to one another" and thus vulnerable to the "recycling" practices
they were interested in. This is a characteristic situation: complex networks,
including social ones, are quite often huge and hard to analyze in detail with
respect to local deviations or small anomalies. This is especially true for on-line
networks. When studying internet-related network structures, analysts can quickly 
become overloaded with data and it is difficult to tell what exactly to look at. The 
urgent question becomes: how to locate tiny islands of relevance in the ocean of 
data archived on the Internet? We offer a possible mathematical method for pre- 
cisely such a task - a more flexible and background-sensitive one (a "softer" one, 
in a way) than those already described and used in the field. 

We should also stress at this point that our task differs from the well-studied 
problem of splitting the network into several modules, which may perhaps overlap, 
but as an ensemble, they cover the network entirely. This is the case in metabolic 
networks, to mention just one example [72, 46]. In our case, we want to focus on a
few "hot spots", small communities of interest within the network, leaving all the
rest behind. 

2. Reviewing networks on Amazon.com as a sociological problem 

Before demonstrating the mathematical procedures, let us also briefly mention some 
sociological contexts of the chosen example. Sociologists have pointed out the in- 
creasing importance of the symbolic content of contemporary economics, which is 
often associated, among others, with users' or consumers' active involvement in the 
complex processes of product evaluation, qualification, and formation [2, 59]. The
role of consumers is particularly enhanced by the Internet and by the ways computer 
technologies shape social networks [15]. 

A specific and significant part of these processes has been recognized as
"peer-production of relevance/accreditation" [8], p. 75, or simply as "reputational
economy" [22]. By reviewing or commenting on items in on-line shops, classifying and
rating them, individual consumers become co-producers of coordinates for others'
economic decision making. They engage in a complex action that cannot be simply
grasped in purely economic terms. As noted by [62], p. 322, spaces of e-commerce
are characterized by countless devices creating a diversity of forms of encounter
between products and consumers.

User reviews and comments, for instance, not only serve the purposes of the 
seller, but also the consumer community, while simultaneously being the means for 
identity building of reviewers themselves [37]. In-depth study of all these complex
phenomena seems crucial for better understanding of contemporary "technological
economies" [6].

What kind of groupings are we interested in when we try to locate segments 
of reviewers connected by shared reviewed products? We might be tempted to talk 
about virtual communities. But these would not be "virtual communities" in the 
usual sense [78, 91]; and they would not be "online social networks" as typically
imagined by social scientists. Both these concepts characteristically refer to groups
of people who directly communicate with each other with the help of computer
networks, i.e., who know (about) each other and interact by means of on-line forums,
instant messaging, or Facebook. Our groups of Amazon.com reviewers represent a
slightly different kind of entity, though. These people usually do not communicate
by addressing each other, and quite often they do not even know each other. They
do not belong to the group by virtue of intentional interaction with the others, 
but "merely" by doing similar things in a relatively uncoordinated way: writing 
reviews on specific products. If [45, 44] drew our attention to the importance of
"weak ties" in social networks, i.e., to the significance of ordinary informal
acquaintances (in comparison to family ties and formal hierarchies), we could speak
here of a kind of "ultra-weak ties". These ties are "virtual" in the sense that they are
not "real enough" in the usual sociological meaning; yet, they are materialized and 
articulated - although not by the reviewers themselves only. We can clearly see the 
connections on the Amazon.com web pages: the reviews of these people are listed 
together, accompanying the respective item in the catalog. Moreover, the review- 
ers do not become members of this community completely unintentionally, but by 
means of a quite intentional and personal act of assessing the product and writing
the review. They create the community by highly mediated interactions, as if "by
the way", together and via the technology of on-line shopping.

In the following section we present mathematical tools for identification, repre- 
sentation and elementary description of precisely such communities. The proposed 
procedures may have a value especially in relation to subsequent sociological anal- 
ysis of these local anomalies, as its precondition. 

3. Finding small communities in networks 
3.1. Motivation 

The problem of identifying parts of the network that bear relevant structural
information can be formulated relatively easily in mathematical terms. The
methodological problem is which of the many possible mathematical formulations
of community detection is suitable for a given purpose. Let us stress that
we neither aim at improving the existing schemes nor present an algorithm which 
should compete with the established ones. Instead, we offer an alternative
scheme which reveals structures not covered by other schemes of community
detection. That is why we not only present a description of the method and its application
to one real-world example, but also spend time putting it into a wider context of 
sociological thinking. 

3.2. Background

For a long time, the standard way of mathematical modeling of social networks [90]
[23] was the "classical" theory of random graphs [HI [24] initiated by the work by 
Erdos and Renyi [32]. However, in the last decade a new class of networks came to be
studied, and the name "complex networks" became the common designation for them
gl[ll[88l[27l[l2l[28l[l3- Compared with the "classical" models of random networks, 
they capture much better the networks found in reality, and at the same time their
models are much more involved than the bare random dropping of edges in the
Erdos-Renyi model. The most immediate characteristic common to complex
networks is a degree distribution with a power-law tail [1].

The strong inhomogeneity of complex networks, implicit in their degree distri- 
bution, changes many aspects of their behavior. In the context of our work, new 
approaches for finding the communities become relevant. While the methods for
determining cliques, k-cliques and motifs [90, 23] work well if the null hypothesis
on the network structure is the Erdos-Renyi random graph, methods better suited
for complex networks were developed [5l[MllMllMl[2Zl[^[Za[lQl[Mllil39j[5j[5ll 
[38l [82l [6T1 [58l [5] [56]. The central quantity for majority of them is the modularity 
measure Q, which is to be made maximal. This is achieved by various optimization 
algorithms. 

Here we will rely on the method of describing the global properties of networks 
using the spectral theory of graphs [20]. It deals with eigenvalues and eigenvectors of
various matrices representing the graph structure, such as the adjacency matrix,
the Laplacian, and others. It has already been used for finding clusters or communities in
networks through the properties of eigenvectors corresponding to the second largest 
eigenvalue [67] [TB] [211 El] . In one step it gives the best partitioning of the network 
into two modules, and the communities are found by repeating the algorithm
recursively. Our approach is different, though. It is similar in spirit to the analysis of
covariance matrices in finance [73, 75], where economic sectors are attributed to
eigenvectors corresponding to the second, the third, etc. largest eigenvalue.

The first level of understanding spectral properties of a random matrix comprises 
the knowledge of the density of eigenvalues. The second involves the localization 
properties of the eigenvectors. It is the latter that is central for our approach. 

Let us say first a few words on the eigenvalue density. Spectra of "classical" 
random graphs, like the Erdos-Renyi model, are closely related to "classical" models 
of random matrices [64]. The typical shape of the eigenvalue density is the Wigner
semi-circle with sharp edges, with the largest eigenvalue split far off from the bulk 
of all other eigenvalues. The first complication arising in the spectrum of a random 
graph is the sparseness of the adjacency matrix, which leads to the emergence of 
Lifschitz tails. This appears already in the Erdos-Renyi model. Despite considerable
effort [75], the Lifschitz tail in the ER graph is still not known in all details. An
asymptotic formula was obtained by several approaches, showing that the density of
states is non-zero at arbitrarily large eigenvalues and decays faster than any power law
[zaiM].

It was soon realized that complex networks, characterized by power-law degree 
distributions, have also non-standard spectral properties [Ml 1121 [29l |30l [33l [7T1 
ElESlEIllMlEllEIlIIilMlESlEe]. First, there is a cusp in the middle part 
of the density of eigenvalues, and second, perhaps more importantly, the tail of 
the eigenvalue density seems to be described by a power law [SH |42| . Numerical 
diagonalization on toy models [29, 30] as well as some analytical estimates confirmed
power-law tails in the density of states. The replica trick [80, 65], as well as the
cavity method [30, 81], were later adapted for scale-free networks. It was found that
the spectrum has a power-law tail characterized by the exponent $2\gamma - 1$, related to
the degree exponent $\gamma$ of the network. Further improvements of the method were
introduced recently [50, 10].

As we shall see, our method is similar to those used in the study of covariance
matrices of stock-market fluctuations [41, 55, 73, 74, 75, 89, 3, 48, 13]. They are
modeled as random matrices of the form $MM^T$, where $M$ is a random rectangular
matrix. The density of states has the Marcenko-Pastur form [63] with sharp edges,
which are smeared out into Lifschitz tails if the matrix is sparse [66].

Most attention was paid to the states in the tail, i.e., located beyond the edge of
the Marcenko-Pastur density and below the maximum eigenvalue, which is always 
split off. These states are supposed to carry the non-trivial information about the 
stock market and, indeed, the shape of the corresponding eigenvectors was used to 
identify business sectors. It was supposed that the eigenvectors were localized on 
items within a specific sector [33, 75, 52]. More sophisticated approaches were also
developed [89].

Our method owes much to the spectral analysis of covariance matrices. However,
we improve these approaches by systematic use of a quantitative measure of the
localization of the eigenvectors, namely the inverse participation ratio. In an
intuitive manner, a similar approach was already used in the analysis of gene
coexpression data [49]. Within this approach, we do not aim at factoring the entire
network into
some number of modules, or communities, which may or may not be overlapping, 
but in any case covering, as an ensemble, the whole network. Instead, we want to 
find small parts of the network which differ structurally from the rest. We may also 
describe our approach as "contrast coloring" of the network, which makes certain 
relevant parts visible against the irrelevant background. 

3.3. Spectral analysis of matrices encoding the structure 

Our analysis will be devoted to bipartite graphs. There are two types of nodes,
making up the sets $\mathcal{R}$ and $\mathcal{I}$. Anticipating our application to the
Amazon.com network, we think of members of $\mathcal{R}$ as reviewers and members of
$\mathcal{I}$ as items to be reviewed. All information on the network structure is
contained in the adjacency matrix $M$ with elements $M_{ri} \in \{0,1\}$. The
out-degree of node $r \in \mathcal{R}$ is $k_r = \sum_i M_{ri}$, the in-degree of
node $i \in \mathcal{I}$ is $k_i = \sum_r M_{ri}$.

In a bipartite graph, the spectral properties are deduced from the contracted
matrices $B = M \cdot M^T$ and $C = M^T \cdot M$. The interpretation of these matrices
is obvious; e.g., $B_{rs}$ tells us how many neighbors the nodes $r$ and $s$ have in
common. A similar construction is used frequently in bipartite networks. As an
example, let us cite the network of tag co-occurrence in the analysis of collaborative
tagging systems [18] or the recommendation algorithms investigated in [95].

In order to partially separate the effects of the network structure from the influ- 
ence of degree distribution, we rescale the matrix elements by the product of square 
roots of the out-degrees. This way, we get the matrix 

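$$A_{rs} = \frac{B_{rs}}{\sqrt{k_r k_s}} \qquad (1)$$

with all diagonal elements equal to 1. We can also be more explicit and write
$A_{rs} = \left(\sum_i M_{ri} M_{si}\right) / \sqrt{\left(\sum_i M_{ri}\right)\left(\sum_i M_{si}\right)}$.
Obviously, the matrix $A$ is symmetric.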
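
The construction of the rescaled matrix can be illustrated by a minimal sketch in plain Python. The function name `rescaled_contraction` and the toy matrix `M_toy` are hypothetical and serve only as an illustration; a real data set of the size discussed later would call for sparse-matrix tools instead of nested lists.

```python
from math import sqrt


def rescaled_contraction(M):
    """Return A with A[r][s] = B[r][s] / sqrt(k_r * k_s), where B = M M^T.

    M is a list of 0/1 rows: one row per reviewer, one column per item.
    """
    k = [sum(row) for row in M]  # out-degrees k_r of the reviewers
    if any(deg == 0 for deg in k):
        raise ValueError("every reviewer needs at least one review")
    n = len(M)
    A = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for s in range(n):
            # B[r][s]: number of items reviewed by both r and s
            common = sum(a * b for a, b in zip(M[r], M[s]))
            A[r][s] = common / sqrt(k[r] * k[s])
    return A


# Hypothetical toy graph: 3 reviewers (rows) and 4 items (columns).
M_toy = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
A_toy = rescaled_contraction(M_toy)
```

In this toy case the diagonal of `A_toy` equals 1 and the matrix is symmetric, as stated in the text; reviewers 0 and 2 share no item, so the corresponding element vanishes.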

The matrix A is then diagonalized. Let us see what information can be extracted 
from the eigenvalues and eigenvectors. First, for any square $N \times N$ matrix $D$
encoding the structure of a graph we can interpret the traces $\frac{1}{N}\mathrm{Tr}\,D^k$
as the density of cycles of length $k$. This number is equal to the $k$-th moment of the
density of eigenvalues of $D$. In our case, the role of $D$ is assumed by the contracted
matrix $B$, and the $k$-th moment of $B$ expresses the density of cycles of length $2k$
on the bipartite graph. If we use the matrix $A$ instead, the moments of the spectrum
are related to the density of weighted cycles. Each time the cycle goes through the
vertex $r \in \mathcal{R}$, it assumes the weight $1/k_r$. Therefore, cycles connecting
vertexes with large degree are counted with lower weight. This is just what we want
here: to put the accent on peripheral, less-connected areas of the network, rather than
on the hubs. If we did not rescale the matrix as in (1), the weight of the hubs, or
strongly-connected
nodes in general, would overshadow the major part of the network, where the small 
communities may lie hidden. 

We expect that the spectrum has a power-law tail. Indeed, this will be confirmed in
the specific example of Amazon.com, which we shall show later. The power-law tail
implies that the density of cycles beyond a certain length diverges. In terms of the
limit $N \to \infty$ it means that the number of such cycles increases faster than linearly
with N. The exponent of the power-law tail tells us what is the threshold for the 
cycle length, beyond which the cycles are anomalously frequent compared to the 
Erdos-Renyi graph. 

What does all this mean for the problem of finding small compact communities? 
If we, for example, use the method of cliques or k-cliques, we tacitly assume that the
"background" network does not contain many of these cliques by pure chance. But 
if, for example, the tail of the spectrum of D has exponent 4, the third moment 
diverges, which means that there are extremely many triangles. No triangle, or 
community of size 3, can therefore be considered as informationally relevant. That 
is why we consider the information on the spectrum of the network an important 
auxiliary information. 
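
A short worked calculation may make the divergence criterion explicit. It assumes that the tail of the eigenvalue density has the pure power-law form $D(\lambda) \sim \lambda^{-\tau}$; the symbol $\tau$ for the tail exponent is introduced here only for illustration.

```latex
\[
  \langle \lambda^{k} \rangle
  = \int^{\infty} \lambda^{k}\, D(\lambda)\, \mathrm{d}\lambda
  \sim \int^{\infty} \lambda^{k-\tau}\, \mathrm{d}\lambda ,
\]
\[
  \text{which diverges if and only if } k - \tau \ge -1,
  \quad \text{i.e. } k \ge \tau - 1 .
\]
```

For $\tau = 4$ the moments with $k \ge 3$ diverge (the third one logarithmically), which is the statement about triangles made above.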

The new algorithm we propose for finding small compact communities relies on
the properties of the eigenvectors. Let us denote by $e_{\lambda r}$ the $r$-th element of
the eigenvector of the matrix $A$ corresponding to the eigenvalue $\lambda$. To study the
localization, we need to calculate the inverse participation ratio (IPR), defined as

$$q^{-1}(\lambda) = \sum_{r=1}^{N} (e_{\lambda r})^4 \qquad (2)$$

where the normalization $\sum_{r=1}^{N} (e_{\lambda r})^2 = 1$ is assumed.

While the IPR says quantitatively to what extent an eigenvector is localized, this
information alone is not sufficient if we want to draw the distinction between
localized and extended states. First, it makes no sense to ask whether a particular
vector is localized, as opposed to extended, or not. What does make sense, however, is
the question whether the states around a certain eigenvalue are localized. The way
to establish that fact is by finite-size analysis. Indeed, if $N$ is the dimension of the
vector space we work with, then

$$q^{-1}(\lambda) \simeq \begin{cases} O(1), & N \to \infty \quad \text{localized state} \\ O(N^{-1}), & N \to \infty \quad \text{extended state.} \end{cases} \qquad (3)$$
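
The two limiting behaviors can be checked on artificial vectors. The following minimal sketch uses hand-made vectors rather than actual eigenvectors of $A$; the function name `inverse_participation_ratio` and the three test vectors are hypothetical illustrations.

```python
from math import sqrt


def inverse_participation_ratio(vec):
    """Return sum_r e_r^4 for the vector vec after normalizing it to unit length."""
    norm2 = sum(x * x for x in vec)
    if norm2 == 0:
        raise ValueError("the zero vector has no IPR")
    return sum(x ** 4 for x in vec) / norm2 ** 2


N = 1000
extended = [1.0 / sqrt(N)] * N           # spread evenly over all N nodes
localized = [1.0] + [0.0] * (N - 1)      # concentrated on a single node
two_site = [1.0, 1.0] + [0.0] * (N - 2)  # shared equally by two nodes

ipr_extended = inverse_participation_ratio(extended)    # of order 1/N
ipr_localized = inverse_participation_ratio(localized)  # of order 1
ipr_two_site = inverse_participation_ratio(two_site)    # 1/2, independent of N
```

The evenly spread vector gives an IPR that shrinks as $1/N$, while the vectors concentrated on one or two nodes give values that do not depend on $N$. The reciprocal of the IPR can thus be read as a rough count of the nodes on which the vector lives.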

Second, the shape of the density of eigenvalues also changes with the system size.
When we increase $N$, the spectrum broadens. In the textbook example of the
Erdos-Renyi graph, the spectrum has sharp band edges. The edge of the ER spectrum
moves when $N$ grows, and if we compare the IPR at different system sizes, we must
measure the eigenvalues relative to the band edge. So, to compare the behavior
at different sizes, we take a random subset of the network containing $N_{\mathrm{sub}}$
nodes. Typically, we choose $N_{\mathrm{sub}} \simeq N/2$. Then, we plot the density
$D(\lambda)$ of eigenvalues for the original network and the density
$D_{\mathrm{sub}}(\lambda)$ for the random subset. The densities are rescaled by the
factor $s$, the value of which is found empirically so that the data for $D(\lambda)$ and
$D_{\mathrm{sub}}(1 + (\lambda - 1)s)$ overlap as much as possible. The form of
this rescaling involves the shift of the eigenvalues by 1, because the matrix $A$ has
its spectrum centered around the value $\lambda = A_{rr} = 1$. With $s$ found, we plot the
IPR for the network and the subset, with the same rescaling as used for the eigenvalue
density. The regions where we observe that $q^{-1}(\lambda)$ remains roughly the same for
the network and its subset are the candidate areas where the localized states are
to be found.

We continue the procedure by determining the eigenvectors with the largest IPR
within the areas of localized states. The elements of these vectors tell us which nodes
of the network belong to the small community. To this end, we fix a threshold $T$
and retain only those nodes $r \in \mathcal{R}$ for which the elements exceed the
threshold in absolute value, $|e_{\lambda r}| > T$. We do not propose any exact method
for fixing $T$. For the sake of consistency, $T$ must be chosen so that the number of
nodes retained is roughly $1/q^{-1}(\lambda)$. In practical applications we observed the
number of retained nodes when $T$ was gradually decreased from $T = 1$. At a certain
crossover value of $T$ we saw that the number of nodes suddenly started increasing
substantially, to values much larger than $1/q^{-1}(\lambda)$. So, we fixed $T$ somewhat
below this crossover. We believe that this procedure could be made automatic by a
software implementation, but we did not do that.
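
The threshold sweep described above can be sketched as follows. This is only an illustration on an artificial vector, not the procedure actually used for the Amazon.com data; the names `retained_nodes` and `sweep_threshold`, the test vector, and the list of thresholds are all hypothetical. The reciprocal of the IPR is used here as the reference count, which is a common reading of the IPR and an assumption of this sketch.

```python
def retained_nodes(vec, T):
    """Indices r whose element exceeds the threshold in absolute value."""
    return [r for r, x in enumerate(vec) if abs(x) > T]


def sweep_threshold(vec, thresholds):
    """Pairs (T, number of retained nodes), in the order the thresholds are given."""
    return [(T, len(retained_nodes(vec, T))) for T in thresholds]


# Hypothetical eigenvector: 4 dominant entries on top of a weak background.
raw = [0.5, 0.49, 0.48, 0.47] + [0.01] * 200
scale = sum(x * x for x in raw) ** 0.5
vec = [x / scale for x in raw]  # normalized to unit length

# Reference count: the reciprocal of the IPR of the normalized vector.
participation = 1.0 / sum(x ** 4 for x in vec)

thresholds = [1.0, 0.5, 0.3, 0.1, 0.05, 0.005]  # decreasing from T = 1
counts = sweep_threshold(vec, thresholds)
count_by_T = dict(counts)
```

In this toy case the count stays at 4 over a wide range of thresholds, close to the reciprocal of the IPR, and jumps to all 204 nodes once the threshold drops below the background level. A plateau followed by a sudden jump is the kind of crossover the text refers to.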

Let us make an important remark at this point. Clearly, we can find some 
localized states also in a randomized version of the network. These states are results 
of pure chance and do not bear significant information. Therefore, we cannot exclude 
that also in the true empirical network, some of the localized states occur just 
accidentally and thus some of the clusters found are spurious. The choice of the
threshold $T$ alone cannot discriminate between the true and the spurious clusters.
However, looking at the dependence of the IPR on the eigenvalue for the true and the
randomized network (as will be seen later in Fig. 4), we can see the regions where
the IPR is large and differs markedly between the true and randomized networks. The
localized states found in these regions (in Fig. 4 they are near the lower edge of the
density of states) correspond to clusters that are non-random and do bear relevant 
information. 

This way we find those vertexes $r \in \mathcal{R}$ which form the community
$C_{\mathcal{R}} = \{r \in \mathcal{R} : |e_{\lambda r}| > T\}$. Next, we proceed by
finding those $i \in \mathcal{I}$ which are connected to them. Here we can distinguish
two levels. First, a vertex $i \in \mathcal{I}$ can be connected to at least two different
vertexes from $C_{\mathcal{R}}$. Then, we say that it belongs to the connectors of
the community,
$C^{\mathrm{con}}_{\mathcal{I}} = \{i \in \mathcal{I} : \exists r, s \in C_{\mathcal{R}} : r \neq s \wedge M_{si} = M_{ri} = 1\}$.
Further, those $i \in \mathcal{I}$ which are connected to just one vertex of
$C_{\mathcal{R}}$ form a more weakly bound part of the community, which we call the cloud,
$C^{\mathrm{cloud}}_{\mathcal{I}} = \{i \in \mathcal{I} \setminus C^{\mathrm{con}}_{\mathcal{I}} : \exists r \in C_{\mathcal{R}} : M_{ri} = 1\}$.
We can explicitly see the asymmetry in constructing the community. This is due
to the fact that we focused on the diagonalization of the contraction matrix acting
in the space $\mathcal{R}$. The procedure can, of course, also be performed in the
opposite direction, diagonalizing the contraction on $\mathcal{I}$. Both ways are
equally justified on the formal level. The choice should be dictated by practical
reasons and by the interpretation we want to draw from the data in any specific
application.
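
The split of the items into connectors and cloud can be written down directly from the two definitions. The sketch below is a minimal illustration; the function name `connectors_and_cloud` and the toy matrix are hypothetical, and the community of reviewers is simply given by hand instead of being obtained from an eigenvector.

```python
def connectors_and_cloud(M, community):
    """Split the items linked to `community` into connectors and cloud.

    M: list of 0/1 rows (reviewers x items); community: iterable of row indices.
    Connectors are linked to at least two community members, cloud items to
    exactly one.
    """
    members = set(community)
    n_items = len(M[0]) if M else 0
    connectors, cloud = set(), set()
    for i in range(n_items):
        links = sum(M[r][i] for r in members)  # edges from the community to item i
        if links >= 2:
            connectors.add(i)
        elif links == 1:
            cloud.add(i)
    return connectors, cloud


# Hypothetical toy graph: 4 reviewers (rows) and 5 items (columns).
M_toy = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
con_items, cloud_items = connectors_and_cloud(M_toy, [0, 1, 2])
```

Here items 0 and 1 are reviewed by at least two members of the community and become connectors, items 2 and 3 are each reviewed by a single member and form the cloud, and item 4, reviewed only by an outsider, belongs to neither set.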

To sum up, our procedure for finding small communities in bipartite networks 
consists in the following steps. 

1. Diagonalize the matrix $A$,
$A_{rs} = \left(\sum_i M_{ri} M_{si}\right) / \sqrt{\left(\sum_i M_{ri}\right)\left(\sum_i M_{si}\right)}$.
The output is the density of states $D(\lambda)$ and the inverse participation ratio
$q^{-1}(\lambda)$.

2. Do the same for a random subset of the network, containing half of the nodes, and
find the proper rescaling factor $s$, so that the rescaled densities of states for the
network and the subset coincide. By rescaling the IPR using the same factor $s$,
determine the regions in which localized states are to be found.

3. Within the localized region, find the eigenvectors with the highest IPR.

4. For each of the eigenvectors found, determine the threshold $T$ and establish
the set $C_{\mathcal{R}}$ of nodes $r$ for which $|e_{\lambda r}| > T$. This set is the
projection of the community to the set $\mathcal{R}$.

5. Find the connector and cloud components of the community on the side of the
set $\mathcal{I}$.



4. An example: Reviewing networks on Amazon.com 
4.1. Basic structural features 

The e-commerce site Amazon.com is one of the oldest and best known on the WWW. 
It has a very rich internal structure, but the user usually sees only a small part 
relevant to the service requested at a particular moment. As already announced, we
shall investigate one aspect of the Amazon.com trading, namely the network made 
up of connections between the items to be sold and the reviewers who have written 
reports on these items. 

This network is a bipartite graph, with items $i = 1, 2, \ldots, N_{\mathrm{itm}}$ on
one side and reviewers $r = 1, 2, \ldots, N_{\mathrm{rev}}$ on the other side. The sets
of vertexes $\mathcal{R}$ and $\mathcal{I}$ introduced in the methodological section
above correspond to the sets of reviewers and items, respectively.

The reviews written are edges connecting these two sets. The structure of the 
network can be uniquely described by the matrix $M$, where the element $M_{ri}$ equals
1 if the reviewer $r$ wrote a review on item $i$, and 0 otherwise.

The data were downloaded using a very simple crawler in the period from 28 
July 2005 to 27 September 2005. First, a list of all $N_{\mathrm{all}} = 1\,714\,512$
reviewers was downloaded; at that time the list containing all Amazon reviewers was
accessible through the web. (This is no longer the case.) The list was naturally ordered
by the rank Amazon assigns to each reviewer. On average, reviewers with higher rank
have written
more reviews, but there are exceptions. For example, at the time of data collection, 
the No. 1 reviewer, Harriet Klausner, had written 9581 reviews, while the No. 2, 
Lawrance M. Bernabo, 10603 reviews. This suggests that it is not only quantity 
but also quality which counts when Amazon ranks its reviewers. We do not touch
here on the obvious question of how the most prolific reviewers manage to read and
review several books per day over many years. As we investigate only
structural features here, these problems are left aside. 

In the next step, we went through the reviewers' list, from the top rank downwards.
We looked only at about the first $10^5$ reviewers and stopped there, as we considered
the sample sufficiently representative. The remaining reviewers are only
occasional writers, contributing one or at most a few reviews. For each reviewer
we found all reviews written by her or him and registered the name of the item
reviewed (mostly books and CDs, of course, but in general all kinds of goods do
appear) as well as some other details about the review. In total, we examined 99 622
reviewers who wrote 2 036 091 reviews on 645 056 items.

4.2. Degree distributions 

The simplest and most accessible local property of the network is the degree
distribution. In the list of reviewers we also recorded the reported number of reviews
written by the particular person. We neglected the possible error in this number
due to various inconsistencies. We believe that the random discrepancies between
the number of reviews reported and the number of reviews which can actually be found
in the system do not influence the statistics to any significant degree. We show the
distribution as the out-degree distribution in Fig. 1. We can observe a clear power-law
dependence except for the few highest degrees. The fitted exponent is
$\gamma_{\mathrm{out}} \simeq 2.2$.

Similarly, we can extract the in-degree distribution from the list of reviews. The
statistics of the number of reviews per item is also shown in Fig. 1, and a power-law
dependence is found again. The corresponding exponent is now
$\gamma_{\mathrm{in}} \simeq 2.35$.

The power-law distribution is by no means surprising, in view of the vast literature
on complex networks. The data provide a clear check that Amazon.com also belongs 
to the class of networks with power-law degree distribution. 
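The bookkeeping behind these two distributions can be sketched in a few lines. This is only an illustration: the toy edge list and all function names are hypothetical, and the exponent is estimated with a standard approximate maximum-likelihood formula for discrete power laws, which is an assumption of this sketch and not necessarily the fitting procedure used for Fig. 1.

```python
import math
from collections import Counter

def degree_sequences(edges):
    """Count reviews per reviewer (out-degree) and reviews per item (in-degree)."""
    out_deg = Counter(reviewer for reviewer, _ in edges)
    in_deg = Counter(item for _, item in edges)
    return out_deg, in_deg

def degree_histogram(degrees):
    """Return {k: fraction of nodes with degree k}, ordered by k."""
    counts = Counter(degrees.values())
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}

def powerlaw_exponent(degrees, k_min=1):
    """Approximate maximum-likelihood gamma for P(k) ~ k**(-gamma), k >= k_min.

    Assumption of this sketch: the usual discrete-data approximation
    gamma = 1 + n / sum(log(k / (k_min - 1/2))).
    """
    ks = [k for k in degrees.values() if k >= k_min]
    return 1.0 + len(ks) / sum(math.log(k / (k_min - 0.5)) for k in ks)

# Hypothetical toy data: one (reviewer, item) pair per review.
toy_edges = [("r1", "a"), ("r1", "b"), ("r1", "c"),
             ("r2", "a"), ("r3", "a"), ("r3", "b")]
out_deg, in_deg = degree_sequences(toy_edges)
hist_out = degree_histogram(out_deg)
gamma_out = powerlaw_exponent(out_deg)
```

On real data one would restrict the estimate to degrees above some cutoff `k_min`, since the power law is stated to hold except for the few highest degrees and need not hold at the smallest ones.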

4.3. Distribution of eigenvalues 

Now we are in a position to calculate the contraction matrix $A$ acting on the set
of reviewers, and diagonalize it. As an additional study, we compare the results
with a randomized version of the reviewer-item network. This way we discriminate
between the influence of the network structure and genuinely random factors.

To this end, we reshuffle the edges in the reviewer-item graph, while keeping
the degrees of all vertices unchanged. The matrix $M$ is replaced by $M^{R}$ and,
correspondingly, the matrix $A$ is replaced by $A^{R}$. Again, we can write
$A^{R}_{rs} = \left(\sum_i M^{R}_{ri} M^{R}_{si}\right)/\sqrt{k_r k_s}$. The only
information on the network structure retained here is the degree sequence. As we
showed in the last section, it obeys a power law, so the features found in analyzing
$A^{R}$ are entirely due to the power-law degree distribution, without any further
structural details.
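A minimal sketch of this randomization step follows. The text does not say how the edges were reshuffled, so the pairwise endpoint swap used here is an assumption, as are the function names and the toy data. The matrix is built from a 0/1 reviewer-item matrix as the sum over items of $M_{ri} M_{si}$ divided by $\sqrt{k_r k_s}$, with $k_r$ the number of reviews by reviewer $r$.

```python
import random
import numpy as np

def reshuffle_edges(edges, n_swaps, seed=0):
    """Degree-preserving randomization by repeated swaps of edge endpoints."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        a = rng.randrange(len(edges))
        b = rng.randrange(len(edges))
        (r1, i1), (r2, i2) = edges[a], edges[b]
        new_a, new_b = (r1, i2), (r2, i1)
        # Skip swaps that change nothing or would create a duplicate edge.
        if r1 == r2 or i1 == i2 or new_a in present or new_b in present:
            continue
        present.difference_update((edges[a], edges[b]))
        present.update((new_a, new_b))
        edges[a], edges[b] = new_a, new_b
    return edges

def contraction_matrix(edges, reviewers, items):
    """A_rs = sum_i M_ri M_si / sqrt(k_r k_s) for a 0/1 reviewer-item matrix M."""
    r_index = {r: n for n, r in enumerate(reviewers)}
    i_index = {i: n for n, i in enumerate(items)}
    M = np.zeros((len(reviewers), len(items)))
    for r, i in edges:
        M[r_index[r], i_index[i]] = 1.0
    k = M.sum(axis=1)  # reviewer degrees k_r, assumed to be nonzero
    return (M @ M.T) / np.sqrt(np.outer(k, k))

# Hypothetical toy network: four reviewers, four items, two reviews each.
reviewers = ["r1", "r2", "r3", "r4"]
items = ["a", "b", "c", "d"]
toy_edges = [("r1", "a"), ("r1", "b"), ("r2", "b"), ("r2", "c"),
             ("r3", "c"), ("r3", "d"), ("r4", "d"), ("r4", "a")]
A = contraction_matrix(toy_edges, reviewers, items)
shuffled = reshuffle_edges(toy_edges, n_swaps=100)
A_R = contraction_matrix(shuffled, reviewers, items)
```

Each swap leaves every reviewer and every item with the same number of edges, so both matrices share the degree sequence and the unit diagonal, while correlations between neighbors are destroyed.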

We diagonalize the matrices $A$ and $A^{R}$. Their eigenvalues
$\lambda_1 > \lambda_2 > \ldots > \lambda_N$ are accumulated around the value
$\lambda = 1$, which corresponds to the uniform diagonal value of both the true and
the randomized matrices. The distributions are plotted in Figures 2 and 3. Let us
now describe what we can see there.

In Fig. 2 we plot the histogram of the eigenvalues of the matrix $A$. Most of them
fall within the interval $\lambda \in [0,3]$, with a sharp maximum in the eigenvalue density
at $\lambda = 1$. The eigenvalue density is much smaller for $\lambda > 3$ and we show the
positions of these eigenvalues as separate ticks. Although the notions "bulk" and "tail"
are not very precise here, we shall use them pragmatically, calling bulk the part with
$\lambda < 3$ and tail the part with $\lambda > 3$.
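The bulk/tail split can be illustrated on a small random example. The sketch below assumes the matrix has the form of a 0/1 reviewer-item matrix multiplied by its transpose and normalized by the square roots of the reviewer degrees; the random toy matrix and the function name are hypothetical. Two properties follow from that form alone: the diagonal is 1, so the eigenvalues average to exactly 1, and the matrix is a Gram matrix, so no eigenvalue is negative.

```python
import numpy as np

def split_spectrum(A, edge=3.0):
    """Return all eigenvalues (descending), the bulk (< edge) and the tail (>= edge)."""
    lam = np.linalg.eigvalsh(A)[::-1]
    return lam, lam[lam < edge], lam[lam >= edge]

# Hypothetical toy data: 30 reviewers, 40 items, random sparse 0/1 matrix.
rng = np.random.default_rng(0)
M = (rng.random((30, 40)) < 0.2).astype(float)
M[np.arange(30), rng.integers(0, 40, size=30)] = 1.0  # every reviewer has a review
k = M.sum(axis=1)
A = (M @ M.T) / np.sqrt(np.outer(k, k))

lam, bulk, tail = split_spectrum(A)
mean_eigenvalue = lam.mean()  # equals trace(A) / N, i.e. 1 up to rounding
```

The lower limit 0 of the interval quoted above is therefore a hard bound, while the position 3 of the bulk edge is the pragmatic choice described in the text.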

In Fig. 2 we can also see the spectrum of the randomized matrix $A^{R}$. The power-law
distribution of degrees is preserved. In the spectrum, we can observe certain
remarkable changes. In the bulk of the density of states, as shown in the inset of
Fig. 2, the spectrum of the reshuffled network lacks the characteristic tip at the
value $\lambda = 1$ and its shape at the lower end of the spectrum is quite different. Most
importantly, a sharp band edge develops. On the other hand, at the upper tail of
the density of states, the changes are of minor importance.

In Fig. 3 we can compare the behavior of the integrated density of states,
$D^{>}(\lambda) = \sum_{i:\,\lambda_i > \lambda} 1$, in the region of large eigenvalues.
For the original matrix $A$ we observe a power-law decay in the tail,
$D^{>}(\lambda) \sim (\lambda - 1)^{-\tau}$ with $\tau \simeq 2$. For the matrix $A^{R}$ the
tail is again quite reasonably fitted by a power law, but with a larger exponent. Let us
recall that the divergence of the moments of the eigenvalue density is related to the
statistics of cycles on the network. For the reshuffled network, the divergence occurs
at higher moments, therefore at cycle lengths longer than in the original network.
This effect may seem tiny, but it reflects a subtle structural difference
which goes beyond the bare degree distribution. In short, the Amazon network has
many more short loops than would be expected knowing only its degree
sequence. This suggests the presence of small self-reinforcing communities. Although
we do not see them yet at this stage, we can perceive their existence through the
density of states of the matrix $A$.
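The tail analysis can be sketched as follows. Here the integrated density is taken to be the number of eigenvalues above a given value, and the exponent is obtained from a least-squares line in log-log coordinates; both the normalization and the fitting method are assumptions of this sketch, and the synthetic spectrum is hypothetical, constructed so that the tail exponent equals 2.

```python
import numpy as np

def integrated_density(lam, x):
    """Number of eigenvalues strictly larger than x."""
    lam = np.sort(np.asarray(lam, dtype=float))
    return len(lam) - np.searchsorted(lam, x, side="right")

def tail_exponent(lam, lam_min=3.0):
    """Least-squares slope of log(count) versus log(lambda - 1) in the tail."""
    lam = np.asarray(lam, dtype=float)
    tail = np.sort(lam[lam > lam_min])
    # For distinct eigenvalues, counts[j] eigenvalues are >= tail[j].
    counts = np.arange(len(tail), 0, -1)
    slope, _ = np.polyfit(np.log(tail - 1.0), np.log(counts), 1)
    return -slope

# Hypothetical synthetic spectrum: the i-th largest value sits at
# 1 + (i / n) ** (-1 / 2), so the count above lambda decays as (lambda - 1) ** (-2).
n = 2000
ranks = np.arange(1, n + 1)
synthetic = 1.0 + (ranks / n) ** (-0.5)
tau = tail_exponent(synthetic)
```

Applying `tail_exponent` to the spectra of the original and of the reshuffled matrix would give the two exponents compared in the text; with only a few tens of eigenvalues in the tail, such a fit is rough and depends on the chosen lower cutoff.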

Interestingly, similar conclusions about small communities were reached in the 
study of collaborative tagging systems (TB] , where two- node correlations were calcu- 
lated in order to estimate the quantity of non-randomness, or semantic information 
content. 

4.4. Localization 

Having investigated the eigenvalues, let us now turn to the properties of the
eigenvectors. We show in Fig. 4 how the IPR depends on the eigenvalue. For the matrix
$A$ we can see stronger localization around the center of the spectrum at $\lambda = 1$. Farther
from the center the localization is weaker, but it increases again at the tails, more
strongly at the lower tail and more gradually at the upper tail. Note also some
isolated highly localized states in the bulk of the eigenvalue distribution.
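As a reminder of what the plotted quantity measures, the sketch below computes the inverse participation ratio in its usual form, the sum of the fourth powers of the components of a unit-norm eigenvector. That this is exactly the normalization used for Fig. 4 is an assumption here, and the toy matrix is hypothetical. An eigenvector spread evenly over $m$ nodes has IPR equal to $1/m$, so a large IPR signals a state confined to a few reviewers.

```python
import numpy as np

def inverse_participation_ratios(A):
    """Eigenvalues (ascending) and the IPR of each corresponding eigenvector."""
    lam, vec = np.linalg.eigh(A)  # columns of vec are unit-norm eigenvectors
    return lam, np.sum(vec ** 4, axis=0)

# Hypothetical toy matrix: two decoupled blocks, a pair and a triple of nodes.
A = np.zeros((5, 5))
A[:2, :2] = 1.0
A[2:, 2:] = 1.0
lam, ipr = inverse_participation_ratios(A)
# The two nonzero eigenvalues are 2 (pair) and 3 (triple); their eigenvectors
# are uniform on the respective block, giving IPR 1/2 and 1/3.
```

For degenerate eigenvalues the eigenvectors are not unique, so the IPR of individual vectors inside a degenerate subspace depends on the basis returned by the solver.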

Now we compare the results with a random subset of size $N_{\rm sub} = 5000$. We found
that the density of eigenvalues coincides very well if we choose the scaling factor
$s = \sqrt{N/N_{\rm sub}}$. With the same scaling we plot the IPR in Fig. 5. We can see
that the absence of a clear-cut band edge is complemented by the absence of any
region of localized states at the upper end of the spectrum. The lower end does show
localized states, though. Therefore, the candidates for compact communities are to
be found close to the lower end of the spectrum. In the next section we describe
what we have found there.
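The subset comparison can be sketched as follows, assuming that the random subset means a principal submatrix, that is, the same randomly chosen reviewers kept in both rows and columns. That reading, the function names and the toy matrix are assumptions of this sketch. How exactly the factor enters the horizontal axis is not restated here, so the code only computes it.

```python
import numpy as np

def random_principal_submatrix(A, n_sub, seed=0):
    """Keep the same random subset of indices in rows and columns."""
    rng = np.random.default_rng(seed)
    keep = np.sort(rng.choice(A.shape[0], size=n_sub, replace=False))
    return A[np.ix_(keep, keep)], keep

def scaling_factor(n, n_sub):
    """s = sqrt(N / N_sub)."""
    return float(np.sqrt(n / n_sub))

# Hypothetical toy matrix standing in for the reviewer-reviewer matrix.
A = np.eye(8) + 0.1
A_sub, keep = random_principal_submatrix(A, 4)
s = scaling_factor(10000, 5000)  # the sizes quoted in the text give sqrt(2)
```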

5. Finding and interpreting the communities 

As we have said, the most localized states are the candidates for small and densely
interlinked communities of reviewers. We counted as members of the community

Fig. 1. Degree distribution of the bipartite reviewer-product network on Amazon.com. Circles
indicate the data for out-degree (reviews per reviewer), triangles for in-degree (reviews per item).
The latter data are shifted rightwards by one decade for better visibility. The lines are the power
laws $\propto k^{-2.2}$ (dashed line) and $\propto k^{-2.35}$ (solid line).




Fig. 2. Distribution of eigenvalues of the reviewer-reviewer matrix. The size of the segment is
$N = 10000$. For $\lambda < 3$ the distribution is plotted as a histogram, while the larger eigenvalues,
$\lambda > 3$, are shown as individual vertical ticks. The largest eigenvalue is indicated by the circle. In
the inset we show the detail of the central part of the same plot. Also in the inset, the dashed
(green in color) line is the distribution of eigenvalues of the matrix obtained by reshuffling the
reviewer-item graph.



only those reviewers whose element in the eigenvector was larger than a threshold,
$|e_{\lambda r}| > T$. The value of the threshold $T$ was found by trial and error, so that all
relevant nodes, on which localization appears, were kept, while the remaining ones,

Fig. 3. Detail of the upper part of the distribution of eigenvalues. The behavior is observed using
the integrated density of eigenvalues. Circles correspond to the original reviewer-reviewer matrix
with $N = 10000$, the triangles correspond to the same matrix subject to permutation of all its
elements. The full line is the power law $\propto (\lambda - 1)^{-2}$; the dashed line is a power law
with a larger exponent.




Fig. 4. Inverse participation ratio for the reviewer-reviewer matrix with $N = 10000$ (○), and the
same for the matrix obtained by reshuffling the reviewer-item graph (+). Each point denotes the IPR
for the eigenvector corresponding to the indicated eigenvalue $\lambda$.



interpreted as a noisy neighborhood, were left out. This adjustment of thresholds
also indicates that the borders of the communities found in this way are not sharp.
In our set of $10^4$ reviewers, the number of communities which can be considered
well localized is about 10. We were able to explicitly draw and interpret 7
communities. With the average size of the communities around 6 people, the fraction of
reviewers in small compact communities can be estimated at about 0.5 per cent. In







Fig. 5. Inverse participation ratio for the reviewer-reviewer matrix. The horizontal axis is rescaled
by the factor $s$ explained in the text. We show the data for the matrix with $N = 10000$ (+, $s = 1$),
and for the random subset with $N_{\rm sub} = 5000$ of the same matrix (×, $s = \sqrt{2}$). Each point denotes
the IPR for the eigenvector corresponding to the indicated eigenvalue $\lambda$.



other words, we have been able to find relatively rare cases in which fractional segments
of the network display an anomalously high density of mutual links. However, we expect
that this fraction would grow rapidly if more reviewers were included from the top
of the Amazon list downwards. From this point of view the small percentage of
reviewers in small communities is partly an artifact of choosing the
reviewers from the top of the list of the most productive Amazon.com
reviewers.
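The membership rule and the size estimate discussed above amount to very little code. In the sketch below the eigenvector components and reviewer labels are hypothetical; only the threshold rule itself and the round numbers (about 10 communities of about 6 reviewers among $10^4$) come from the text.

```python
def community_members(eigenvector, reviewers, threshold):
    """Reviewers whose eigenvector component exceeds the threshold in absolute value."""
    return [r for r, e in zip(reviewers, eigenvector) if abs(e) > threshold]

def fraction_in_communities(n_communities, mean_size, n_reviewers):
    """Rough share of reviewers that sit in small compact communities."""
    return n_communities * mean_size / n_reviewers

# Hypothetical localized eigenvector: two large components, the rest is noise.
vector = [0.70, -0.69, 0.15, 0.05, -0.10]
names = ["r1", "r2", "r3", "r4", "r5"]
members = community_members(vector, names, threshold=0.2)

# Round numbers from the text: about 10 communities of about 6 reviewers
# in a set of 10^4 reviewers.
fraction = fraction_in_communities(10, 6, 10_000)
```

With these round inputs the estimate comes out at 0.6 per cent, the same order of magnitude as the figure of about 0.5 per cent quoted above. Since the threshold is chosen by hand for each eigenvector, both the membership and this fraction should be read as approximate.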

Now, let us look at several specific examples of the communities found. The first
example of such a small grouping is shown in Fig. 6. (In this case we took the 5th
most localized vector, with IPR $= 0.095675$ and corresponding eigenvalue $\lambda = 0.359$, and the
threshold was taken as $T = 0.2$.) The items reviewed by the reviewers within the
community found in this way are of two types. First, there are those reviewed by at
least two reviewers from the community. These items keep the community together
and we call them "connectors". We show them in Fig. 6 linked to their corresponding
reviewers. However, one should note that the reviewers themselves play the role of
"connectors" for the items, to the same extent as the items are "connectors" for the
reviewers. Second, there are items reviewed by only one reviewer of the community.
These items form a kind of "cloud" around the core of the network segment. We
do not show the "cloud" in our figures, but we shall discuss its meaning later.
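The distinction between "connectors" and the "cloud" translates directly into a count of how many community members reviewed each item. The sketch below is a minimal illustration with hypothetical reviewer and item labels.

```python
from collections import Counter

def connectors_and_cloud(edges, members):
    """Split items reviewed by the community into connectors and cloud.

    Connectors are reviewed by at least two members, cloud items by exactly one.
    """
    members = set(members)
    counts = Counter(item for reviewer, item in set(edges) if reviewer in members)
    connectors = {item for item, c in counts.items() if c >= 2}
    cloud = {item for item, c in counts.items() if c == 1}
    return connectors, cloud

# Hypothetical toy data: r4 is not a member of the community.
toy_edges = [("r1", "x"), ("r2", "x"), ("r1", "y"),
             ("r3", "y"), ("r3", "z"), ("r4", "x")]
connectors, cloud = connectors_and_cloud(toy_edges, ["r1", "r2", "r3"])
```

Reviews by non-members are ignored, so an item that is very popular outside the community still counts as a cloud item if only one member reviewed it.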

What are the product-connectors in the given community (Fig. 6)? We can see
that the maximum number of reviewers for one item is 4 and it holds for two audio
recordings: "The Beatles (The White Album)" and "Abbey Road", also by The Beatles.
Thus, the core of the community is kept together by one of the most popular music
bands ever. The remaining items are thematically close. They refer to other records

[Fig. 6, item labels of the graph. Shaded box: Led Zeppelin III; Houses Of The Holy;
Led Zeppelin IV; Led Zeppelin 1st; Sticky Fingers. Bob Dylan: Another Side of Bob Dylan;
John Wesley Harding; Nashville Skyline; Self Portrait; New Morning; Blood on the Tracks;
Desire; Street Legal; Slow Train Coming; Planet Waves; Knocked Out Loaded; Under the Red Sky;
Time Out of Mind; The Times They Are A-Changin'; Bringing It All Back Home;
Highway 61 Revisited; Pat Garrett & Billy the Kid; The Basement Tapes; Shot of Love; Infidels;
Oh Mercy; World Gone Wrong; Empire Burlesque; Good as I Been to You; Down in the Groove;
Bob Dylan Real Live [In Europe, 1984]; Bob Dylan: MTV Unplugged [Live, 1994];
The Bootleg Series, Vol. 4: Bob Dylan Live, 1966. The Beatles and ex-members: Imagine;
Venus and Mars; Past Masters, Vol. 1; Ram; McCartney; Abbey Road;
The Beatles (The White Album); Plastic Ono Band; Milk and Honey; Tug Of War;
Beatles for Sale; A Hard Day's Night (1964 Film); With the Beatles; Past Masters, Vol. 2;
Magical Mystery Tour; Let It Be; Double Fantasy; The Beatles 1; All Things Must Pass;
Yellow Submarine (Songtrack); Sgt. Pepper's Lonely Hearts Club Band; Help!; Run Devil Run;
Yellow Submarine; Please Please Me; Rubber Soul.]



Fig. 6. The "pop-music" community in the network producing a very localized eigenvector of the
matrix A. In the middle, code-names of the reviewers; on the right, recordings by The Beatles
(mostly as a band, some by individual members); on the left, recordings by Bob Dylan, with the
exception of the shaded box, which contains four recordings by Led Zeppelin and one by the
Rolling Stones.



by The Beatles and by ex-members of the band, or to the music of Bob Dylan. (Ex-)Beatles
and Dylan cover about a half of the items each. The only exception is
a small set of five recordings of other pop classics, namely four by Led Zeppelin
and one by the Rolling Stones. In short, all items fall into the range of very
well-known pop-music stars. It is interesting that this characteristic does not concern






[Fig. 7, item labels of the graph. The X-Files: First Season; Second Season; Third Season;
Fourth Season; Fifth Season; Sixth Season; Seventh Season; Eighth Season; Ninth Season.
Other titles: Buffy the Vampire Slayer - The Complete First Season;
Buffy the Vampire Slayer - The Complete Sixth Season; Gladiator; The Last Samurai;
The Adventures of Indiana Jones; Terminator 3 - Rise of the Machines;
Pirates of the Caribbean - The Curse of the Black Pearl; Kill Bill, Volume 1;
The Lord of the Rings - The Two Towers; The Lord of the Rings - The Return of the King;
Star Wars - Episode II, Attack of the Clones; The Matrix Reloaded; The Matrix Revolutions.]



Fig. 7. The "pop-movie" community in the network producing a very localized eigenvector of the
matrix A. In the middle, code-names of the reviewers, on the right, the X-Files series, on the left,
other popular movies.



[Fig. 8, labels of the graph. Reviewer names as extracted: Donnie Brasco; David S. Rhodes;
Steven E Rustad. Books: Shut Up and Sing; Liberalism is a Mental Disorder;
The Price of Loyalty; Rome Wasn't Burnt in a Day; Plan of Attack; All the President's Spin;
Reason; The Republican Noise Machine; What's the Matter with Kansas?;
Lies and the Lying Liars Who Tell Them; Deliver Us from Evil; Where the Right Went Wrong;
Unfit for Command; A National Party No More; Slander, by Ann Coulter; Treason, by Ann Coulter;
How to Talk to a Liberal (If You Must), by Ann Coulter; The Truth About Hillary.]



Fig. 8. The "pop-politics" community in the network producing a very localized eigenvector of the
matrix A. In the middle, code-names of the reviewers, on the left and right, books treating mainly
the clash of (neo-)conservatives versus liberals in the USA. Note that Ann Coulter is the most
prominent book author in this community.



the connecting items only, but also the majority of all other reviews by the members of the
community (not included in the graph). Thus, not only the connectors, but also the
"cloud" bears the same characteristics.

Therefore, the interest of these reviewers lies, in general, within a rather narrow
scope determined by the pair Dylan-Beatles, with some small excursions farther
into mainstream pop music, similar to the small "Led Zeppelin" set in Fig. 6. For
example, the reviewer gdb has also written on CDs by U2 and David Bowie, while
the "cloud" reviews by Cristian Domarchi (not listed in Fig. 6) pertain only to other
recordings by ex-Beatles plus one book; among all the 6 reviewers, only Stephanie
Sane shows interests which go clearly beyond the Dylan-Beatles repertoire, reviewing
a good deal of books, mostly mystery and detective fiction.

A similar picture is provided by the analysis of other communities. Let us very 
briefly describe two more of them. 

The first one (Fig. 7) belongs to another pop-cultural domain, this time concentrating
on DVD movies with a sci-fi and fantasy flavor. Again, we found that
the reviewers are active within rather narrow bounds. They focus on widely popular
titles, overlapping very little with any other possible themes or genres. Only a small
part of the reviews by the members of the community are related to something else,
e.g., to books by M. Proust and T. Mann.

The third and last example we want to mention is shown in Fig. 8. In analogy
with the former examples, the "pop-music" and "pop-movie" communities, we may
call this one a "pop-politics" community. The reviewers here concentrate on books
discussing the presidency of G. Bush, the evils of liberal ideology, as compared with
neo-conservatism, and so on. The core of the community is kept together by the
books of Ann Coulter, who is known as a militant anti-liberal writer. The majority of the
books in this group are targeted at the widest public, as is the music by The Beatles
and movies of the "X-Files" type. Their themes are not esoteric and these products are
not aimed at specialized audiences; yet, the zeal of the reviewers makes a "cult" of
them. Again, this community is narrowly defined by the interest in these popular
issues and not much else. In the "cloud" of other items reviewed by the members
of this community we find some other books by Ann Coulter, accompanied by
books such as (the titles are self-explanatory, we believe) Worse Than Watergate:
The Secret Presidency of George W. Bush; Blinded by the Right: The Conscience
of an Ex-Conservative; A Matter of Character: Inside the White House of George
W. Bush; The Family: The Real Story of the Bush Dynasty; Chain of Command:
The Road from 9/11 to Abu Ghraib, and similar. Out of the six reviewers, only
Donnie Brasco shows some additional field of interest, having written about various
pop-music CDs as well.

Let us sum up these observations (supported also by analyses of other small
communities we were able to find in the sample). Our expectation that strongly
localized eigenvectors would reveal some specific small communities was fulfilled in
the sense that we have indeed found groups of zealots, concentrated on a relatively
narrow segment of commodities sold on Amazon.com. Individual interests of these
reviewers only rarely reach beyond the theme common to the community.

On the other hand, however, it would be misleading to imagine these people
as eccentrics focused on highly specialized, marginal or even extreme cultural
artifacts. The subjects of their reviews are quite ordinary, clearly part of the cultural
mainstream. And, by their tastes, the reviewers themselves seem to belong to wide
audiences, often focused on classics or well-established pop-cultural products. In
other words, anomalous tiny fragments of this huge network, characterized by various
authors repeatedly writing reviews on the same items, refer typically not to
some marginal cultural forms with specialized contents, but rather to widely shared
cultural tastes and mainstream enthusiasts.

A more detailed analysis of these findings is beyond the scope of this methodological
paper and its analytical illustration. Very probably, several possible
explanations could turn out to be valid in parallel, including the nature of the Amazon.com
portal (primarily designed for general audiences and as wide a consumer population
as possible), a possibly higher probability that reviews on widely favored artifacts
get "localized", etc. What should perhaps be stressed here, however, is the peculiar
character of the communities or network segments under discussion. It is clear
that the tiny network fragments counting 5 or 10 reviewers and dozens of reviews
cannot represent "big" consumer populations and "widespread" artifacts in some
straightforward way. Rather, they may provide a rather specific ("small-scale") way
of looking at a mass-scale phenomenon. Let us say something more about this
specificity.

We have already noted that the network and its segments we are studying are
not a "social network" as traditionally envisaged. The interaction constituting the
network is so massively mediated and by-produced (while remaining observable,
"real" enough and grounded in intentional social action) that we leave the territory
of what is usually counted by social scientists as a "group" or "community". But
even more is at stake in this direction. A closer view of our findings reveals that
one cannot unambiguously say whether the "connecting" reviewed products provide
an interpretive framework for statements about the reviewers, or whether, on
the contrary, it is the reviewers and their actions that provide clues for interpreting
communities of products. In other words, we are unable to determine whether we
study groupings of people (connected by products) or of commodities (connected
by people). In fact, we should rather try to understand both within a single hybrid
network, meaningfully connected. When studying the phenomena of product reviewing,
products and reviewers cannot be separated. The sets of products represented in
our figures (Figs. 6, 7, 8) simply do not make sense (and do not hold together) without
the reviews written about them by the represented reviewers. Indeed, the products
grouped by, e.g., purchases carried out by Amazon.com users would look different.
On the other hand, the groups of reviewers would not make sense without the particular
reviewed products (their amount and nature). Thus, we believe the segments
identified in our example can directly represent neither populations of consumers
nor entire sections in the Amazon.com commodities catalog. Rather, they represent,
in a complex way and as if under a specific lens, a phenomenon of on-line
user reviewing, a better understanding of which may contribute to our knowledge of
contemporary popular culture and technologically mediated economic processes.

6. Conclusions 

Thanks to numerous sociological efforts in the field of social network analysis as
well as the work on networks done in other scientific disciplines such as theoretical
physics, various mathematical tools have been developed. They aim either at
determining large-scale structures in complex networks or at identification of smaller
network segments such as cliques or acquaintances. In this work, we introduced a
new mathematical procedure relatively close to the latter type of task. We believe
the method is well suited for finding the most relevant small segments of complex
networks, when "relevance" cannot or need not be equated with some absolute level
of mutual connectivity between the nodes. We argue that this is often the case,
because important social forces or processes are often related to highly mediated
and heterogeneous groupings, typically constituted as by-products of various,
differently oriented actions, and where people characteristically do not
intentionally address each other and do not even know each other (here, we could
speak of "ultra-weak" ties). The proposed method based on well-localized eigenvectors
is well able to find these small communities with anomalously high density
of mutual links and therefore reveal a kind of semantic information hidden in the
network, otherwise often neglected. As such, our method may be a good starting
point for a more fine-grained further analysis of given phenomena.

As an empirical example, we have chosen the data available from the Amazon.com
on-line shopping portal. We studied the network constituted by users writing
reviews of the same products offered for purchase on the website during the
summer of 2005, when the data were gathered. Reviewers become connected if they
have written a review on an identical item. When such connections locally proliferate,
we get a grouping of relevance.

These groupings are not directly related to the top-lists of popularity, but reveal
the most focused points in the network. They are constituted by socially rather
distant ties, i.e., by a kind of ultra-weak ties, namely highly mediated links
by-produced during processes primarily aimed at something else than addressing each
other to establish acquaintance or become closer.

The first important result of our analysis is the power-law tail in the density of
eigenvalues. This feature is partially, but not entirely, due to the power-law degree
distribution. Comparing the spectrum arising from the network with the spectrum
of a random network with the same degree sequence, we find a power-law tail in both
cases, but the exponent is significantly smaller in the original network. Generally,
such a tail implies that the density of cycles beyond a certain length diverges when
the size of the network tends to infinity. The difference in the exponent means that
some shorter cycles keep a finite density in the randomized network, while in the
original one they are much more abundant. This means that the Amazon network
contains many more compact groupings than would be expected knowing only
its degree sequence.
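The step from the tail exponent to the statistics of cycles can be made explicit. The following is a sketch under the assumption that, in the infinite-size limit, the integrated density of eigenvalues decays as a pure power law, $D^{>}(\lambda) \propto (\lambda - 1)^{-\tau}$, where $\tau$ denotes the tail exponent and $\rho(\lambda)$ the corresponding eigenvalue density.

```latex
\rho(\lambda) = -\frac{d D^{>}(\lambda)}{d\lambda} \propto (\lambda - 1)^{-\tau - 1},
\qquad
\langle \lambda^{n} \rangle = \int \lambda^{n} \rho(\lambda)\, d\lambda
\ \text{ diverges for } n \ge \tau,
\qquad
\frac{1}{N} \operatorname{Tr} A^{n} = \frac{1}{N} \sum_{i=1}^{N} \lambda_i^{n} .
```

The last expression is the $n$-th moment of the spectrum, and each diagonal element of $A^{n}$ is a weighted sum over closed walks of $n$ steps returning to the same reviewer. A smaller $\tau$ therefore means that the divergence sets in already for shorter closed walks, which is the sense in which the original network is richer in short cycles than its randomized counterpart.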

To see at least some of these small groupings, we looked at well-localized
eigenvectors. These localized states represent small communities or network segments
and bear semantic information hidden in the network. We call them "hot spots",
as they represent local structures which differ from the surrounding background.
We were able to explicitly find some of these communities and attribute meaning to
them. The three of them briefly discussed in this paper can be labeled as pop-music,
pop-movie, and pop-politics communities. The reviewers of these communities are
very strongly focused on one narrow segment. This segment itself usually belongs
to mass or popular culture, so it cannot be considered marginal or esoteric. It is
the enthusiasm of the reviewers which singles the segment out of the sea of millions
of products traded on Amazon.com.

Our analysis shows that only about half a per cent of the reviewers belong to these
network segments in the small sample of $10^4$ reviewers. However, we expect that
this fraction would grow rapidly if more reviewers were included from the top of the
Amazon list downwards. If carefully treated and interpreted, the identified network
segments may be useful for enhancing our knowledge of mass or popular culture
and complex economic processes related to E-consumerism. In particular, it would be
interesting to make a systematic classification of the small communities.

Besides these specific findings we would like to highlight another, more general
feature. When analyzing the chosen example, it turned out that the conventional talk
of "networks of reviewers" might be sociologically misleading. Our groupings, in
fact, were constituted not only by people writing reviews on the same products, but
also (simultaneously) by products reviewed by the same reviewers. That is why we
decided to switch to the more appropriate term "networks of reviewing". This term
indicates the hybrid nature of the networks we have been dealing with and it allows
us to talk about processes of the online economy rather than about bare structures
composed of its human agents. In this respect, our approach is well compatible
with the currently increasing emphasis on heterogeneity as an essential quality of
collectivities studied by social scientists [60].

The method can be applied in a straightforward way to any kind of network,
wherever the data can be collected easily. However, technical limitations of the
method may arise in networks larger than several tens of thousands of vertices, due
to computer memory limitations. As shown also by the example of Amazon.com,
on-line networks are often larger than that. Then, we must decide which subset of
the whole network can be considered representative. In our case we chose the subset
of the most productive reviewers, but other networks might require other criteria.
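The memory limitation mentioned above is easy to quantify. The sketch below assumes that the full matrix is stored densely in double precision (8 bytes per element), which is an assumption about the implementation and not something stated in the text; the function name is hypothetical.

```python
def dense_matrix_bytes(n, bytes_per_entry=8):
    """Memory needed to hold a dense n x n matrix."""
    return n * n * bytes_per_entry

# Sizes taken from the text: the analyzed segment and all examined reviewers.
for n in (10_000, 99_622):
    gigabytes = dense_matrix_bytes(n) / 1e9
    print(f"N = {n}: {gigabytes:.1f} GB")
```

Under this assumption the segment of $10^4$ reviewers needs about 0.8 GB, while all 99 622 examined reviewers would need roughly 79 GB for the matrix alone, and a full set of eigenvectors takes the same amount again.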